02:00
For coefficient \(\beta_j\):
\[\hat{\beta}_j \pm t_{\alpha/2, n-p} \times \text{se}(\hat{\beta}_j)\]
Where:
- \(t_{\alpha/2, n-p}\) is the critical value from the \(t\)-distribution
- \(\text{se}(\hat{\beta}_j)\) is the standard error
- \(n - p\) is the degrees of freedom
\(\hat{\boldsymbol{\beta}} \sim N(\boldsymbol{\beta}, \sigma^2(\mathbf{X}^T\mathbf{X})^{-1})\)
For individual coefficient:
\[\hat{\beta}_j \sim N(\beta_j, \sigma^2[(\mathbf{X}^T\mathbf{X})^{-1}]_{jj})\]
Standardize:
\[\frac{\hat{\beta}_j - \beta_j}{\sigma\sqrt{[(\mathbf{X}^T\mathbf{X})^{-1}]_{jj}}} \sim N(0, 1)\]
Problem: We don’t know \(\sigma\), so we estimate it with \(\hat{\sigma}\)
Result: Replace \(\sigma\) with \(\hat{\sigma}\) and the distribution changes from normal to Student's \(t\)
\[\frac{\hat{\beta}_j - \beta_j}{\hat{\sigma}\sqrt{[(\mathbf{X}^T\mathbf{X})^{-1}]_{jj}}} \sim t_{n-p}\]
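As a quick sanity check (a sketch, not part of the derivation), we can verify by simulation that the standardized statistic with \(\hat{\sigma}\) really does follow \(t_{n-p}\); the sample size and coefficients below are illustrative:

```r
# Simulate many datasets and compute the studentized slope statistic
set.seed(1)
n <- 15; beta1 <- 2
x <- runif(n, 0, 10)
t_stats <- replicate(5000, {
  y <- 1 + beta1 * x + rnorm(n)
  fit <- summary(lm(y ~ x))$coefficients
  (fit["x", "Estimate"] - beta1) / fit["x", "Std. Error"]
})
# Empirical quantiles should be close to t quantiles with n - 2 df
quantile(t_stats, c(0.025, 0.975))
qt(c(0.025, 0.975), df = n - 2)
```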
\[\text{se}(\hat{\beta}_j) = \hat{\sigma}\sqrt{[(\mathbf{X}^T\mathbf{X})^{-1}]_{jj}}\]
Where:
- \(\hat{\sigma}^2 = \frac{\text{SSE}}{n-p}\) (MSE)
- \([(\mathbf{X}^T\mathbf{X})^{-1}]_{jj}\) is the \(j\)-th diagonal element
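A minimal sketch of this formula from scratch on hypothetical toy data (the data and coefficients are made up for illustration):

```r
# Compute se(beta_hat_j) = sigma_hat * sqrt([(X'X)^-1]_jj] by hand
set.seed(42)
n <- 10
x <- runif(n)
y <- 1 + 2 * x + rnorm(n)
X <- cbind(1, x)                        # design matrix with intercept column
XtX_inv <- solve(t(X) %*% X)
beta_hat <- XtX_inv %*% t(X) %*% y
sse <- sum((y - X %*% beta_hat)^2)
p <- ncol(X)
sigma2_hat <- sse / (n - p)             # MSE
se <- sqrt(sigma2_hat * diag(XtX_inv))  # se(beta_hat_j) for each coefficient
# Agrees with summary(lm(y ~ x))$coefficients[, "Std. Error"]
```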
Given:
\[(\mathbf{X}^T\mathbf{X})^{-1} = \begin{bmatrix} 0.25 & -0.10 \\ -0.10 & 0.05 \end{bmatrix}\] \[\hat{\sigma}^2 = 16\]
Find: \(\text{se}(\hat{\beta}_1)\)
02:00
Formula:
\[\text{CI}_{95\%} = \hat{\beta}_j \pm t_{0.025, n-p} \times \text{se}(\hat{\beta}_j)\]
Given: \(\hat{\beta}_1 = 3.2\), \(\text{se}(\hat{\beta}_1) = 0.8\), \(n = 30\), \(p = 3\)
Step 1: Find critical value
\(t_{0.025, 27} = 2.052\) (qt(0.975, 27))
Step 2: Construct interval
\[\text{CI}_{95\%} = [3.2 - 2.052\times0.8, 3.2 + 2.052\times0.8] = [1.6, 4.8]\]
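The same worked example can be checked in R with the values from the slide:

```r
# Reproduce the worked 95% CI for beta_1
beta1_hat <- 3.2; se1 <- 0.8; n <- 30; p <- 3
t_crit <- qt(0.975, n - p)               # 2.052
beta1_hat + c(-1, 1) * t_crit * se1      # approximately [1.6, 4.8]
```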
What this means:
If we repeated this procedure many times with different samples from the same population, 95% of the intervals would contain the true \(\beta_1\)
What we CANNOT say:
“There is a 95% probability that \(\beta_1\) is in this interval”
(The parameter is fixed, the interval is random!)
Given:
- \(\hat{\beta}_2 = 5.6\)
- \(\text{se}(\hat{\beta}_2) = 1.2\)
- \(n = 25\), \(p = 4\)
- \(t_{0.025, 21} = 2.08\)
Find: 95% confidence interval for \(\beta_2\)
03:00
beta_hat <- coef(model)
se_beta <- summary(model)$coefficients[, "Std. Error"]
n <- length(y)
p <- length(beta_hat)
df <- n - p
t_crit <- qt(0.975, df)
lower <- beta_hat - t_crit * se_beta
upper <- beta_hat + t_crit * se_beta
cbind(lower, upper)
                 lower     upper
(Intercept) -0.5803601 0.6803601
x            1.7999393 2.1800607
Using the data below, fit a model and compute 99% confidence intervals.
03:00
We can resample the data to estimate its own variability
How it works:
1. Resample your data (with replacement)
2. Fit model to resampled data
3. Record \(\hat{\beta}\)
4. Repeat many times (typically 1000+)
5. Use distribution of \(\hat{\beta}\) values to build CI
1. Percentile method
Take quantiles of bootstrap distribution
2. Bootstrap t
Take the SE from bootstrap distribution and use a critical t-value to construct an interval
3. Bias-corrected and accelerated (BCa)
Adjusts for bias and skewness (most accurate)
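The slides work through the percentile method below; a bootstrap-t interval for the slope could be sketched like this (self-contained and illustrative, reusing the same simulated data-generating setup as the percentile example):

```r
# Bootstrap-t CI: SE from the bootstrap distribution, t critical value
set.seed(1)
x <- 1:20
y <- 2 + 3 * x + rnorm(20, 0, 5)
d <- data.frame(x, y)
fit <- lm(y ~ x, data = d)
boot_slopes <- replicate(1000, {
  b <- d[sample(nrow(d), replace = TRUE), ]
  coef(lm(y ~ x, data = b))["x"]
})
se_boot <- sd(boot_slopes)                       # SE from bootstrap distribution
coef(fit)["x"] + c(-1, 1) * qt(0.975, 18) * se_boot  # df = n - p = 18
```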
Algorithm:
1. Generate B bootstrap samples (e.g., B = 1000)
2. For each sample, compute \(\hat{\beta}_j^{(b)}\)
3. Sort all B values
4. Take 2.5th and 97.5th percentiles for 95% CI
CI formula:
\[\text{CI}_{95\%} = [Q_{0.025}, Q_{0.975}]\] where \(Q_\alpha\) is the \(\alpha\)-quantile of bootstrap distribution
In R, for each bootstrap iteration \(b = 1, \ldots, B\) we resample the data and refit the model:
set.seed(1)
x <- 1:20
y <- 2 + 3*x + rnorm(20, 0, 5)
data <- data.frame(x, y)
model <- lm(y ~ x, data)
beta_hat <- coef(model)
B <- 1000
bootstrap_one <- function(data) {
  boot_data <- data[sample(1:nrow(data), replace = TRUE), ]
  boot_model <- lm(y ~ x, data = boot_data)
  coef(boot_model)
}
boot_betas <- purrr::map(1:B, ~bootstrap_one(data)) |>
dplyr::bind_rows()
boot_betas
# A tibble: 1,000 × 2
   `(Intercept)`     x
           <dbl> <dbl>
 1       -2.01    3.28
 2        2.64    3.02
 3        0.256   3.27
 4       -0.131   3.08
 5        2.07    3.17
 6        1.60    3.11
 7        0.584   3.18
 8        4.55    2.80
 9        4.14    2.86
10        2.25    3.01
# ℹ 990 more rows
boot_ci_intercept <- quantile(boot_betas$`(Intercept)`, c(0.025, 0.975))
boot_ci_slope <- quantile(boot_betas$x, c(0.025, 0.975))
cat("Bootstrap CI for intercept:", round(boot_ci_intercept, 3), "\n")
cat("Bootstrap CI for slope:", round(boot_ci_slope, 3), "\n")
Bootstrap CI for intercept: -1.261 5.968
Bootstrap CI for slope: 2.755 3.351
For comparison, the parametric intervals from confint():
confint(model)
                2.5 %   97.5 %
(Intercept) -2.714035 6.353106
x            2.729458 3.486368
Compute a bootstrap 90% CI for the slope
05:00
Definition: The proportion of confidence intervals that contain the true parameter
Ideal coverage: Should match confidence level
- 95% CI should have 95% coverage
- 90% CI should have 90% coverage
In practice: Coverage can be lower due to:
- Violated assumptions
- Small sample sizes
- Model misspecification
The procedure:
1. Choose true parameter values
2. Generate data from the model
3. Compute confidence interval
4. Check if true value is inside
5. Repeat many times
6. Calculate proportion containing truth
set.seed(1)
n_sims <- 1000
sim <- function(true_beta0 = 2, true_beta1 = 1, sigma = 5, n = 30) {
  # Generate data
  x <- runif(n, 0, 10)
  y <- true_beta0 + true_beta1 * x + rnorm(n, 0, sigma)
  # Fit model and get CI
  model <- lm(y ~ x)
  ci <- confint(model, level = 0.95)[2, ]  # CI for slope
  # Check coverage
  ci[1] <= true_beta1 && true_beta1 <= ci[2]
}
coverage <- purrr::map_lgl(1:n_sims, ~sim())
mean(coverage)
[1] 0.957
Our result: ~95% coverage (as expected!)
If coverage is too low (<95%):
- Standard errors may be underestimated
- Assumptions may be violated
- Consider robust methods or bootstrap
If coverage is too high (>95%):
- CIs are conservative (too wide)
- Losing power
- May be okay, but inefficient
Sample size:
- Small n can have poor coverage
- Large n coverage approaches nominal level
Assumption violations:
- Non-normality (less important with large n)
- Heteroscedasticity
- Outliers
Model misspecification:
- Wrong functional form
- Missing important variables
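The coverage simulation above can be adapted to see this effect; here is an illustrative sketch (the quadratic truth and coefficients are made up) where a straight line is fit to curved data:

```r
# Coverage collapses under a wrong functional form:
# truth is quadratic, but we fit a linear model and ask whether the
# slope CI covers the "linear part" beta1 = 1
set.seed(1)
sim_misspec <- function(n = 30) {
  x <- runif(n, 0, 10)
  y <- 2 + 1 * x + 0.3 * x^2 + rnorm(n, 0, 5)
  ci <- confint(lm(y ~ x))[2, ]          # CI for slope from the wrong model
  ci[1] <= 1 && 1 <= ci[2]
}
mean(purrr::map_lgl(1:1000, ~sim_misspec()))  # far below the nominal 0.95
```

The fitted slope targets the best linear approximation to the curve, not \(\beta_1 = 1\), so the intervals systematically miss the truth.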